Classification of English language learner writing errors using a parallel corpus with SVM

Authors

  • Brendan Flanagan
  • Chengjiu Yin
  • Takahiko Suzuki
  • Sachio Hirokawa
Abstract

In order to overcome mistakes, learners need feedback to prompt reflection on their errors. This is a particularly important issue in education systems, as a system's effectiveness in finding errors or mistakes could have an impact on learning. Finding errors is essential to providing appropriate guidance so that learners can overcome their flaws. Traditionally, the task of finding errors in writing takes time and effort. The authors of this paper have a long-term research goal of creating tools for learners, especially autonomous learners, that make them more aware of their errors and provide a way to reflect on those errors. As a part of this research, we propose the use of a classifier to automatically analyse and determine the errors in foreign language writing. For the experiment in this paper, we collected random sentences from the Lang-8 website that had been written by foreign language learners. Using predefined error categories, we manually classified the sentences to use as machine learning training data. We then trained an SVM classifier on this data. As the manual classification of training data takes time, the classifier is intended to accelerate the generation of further training data.
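
The pipeline described in the abstract (manually labelled learner sentences used to train an SVM classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the example sentences, the single "missing article" error category, the unigram and bigram count features, and the hand-written linear SVM trainer (stochastic subgradient descent on the L2-regularised hinge loss) are all assumptions made here to keep the example self-contained. None of them come from the paper or from Lang-8.

```python
from collections import Counter

# Hypothetical toy data, invented for this sketch.
# Label +1: the sentence has a missing-article error; -1: it does not.
TRAINING_DATA = [
    ("I bought new car yesterday", 1),
    ("I bought a new car yesterday", -1),
    ("She is teacher at school", 1),
    ("She is a teacher at school", -1),
    ("He has dog and cat", 1),
    ("He has a dog and a cat", -1),
    ("I saw movie last night", 1),
    ("I saw a movie last night", -1),
]


def featurize(sentence):
    """Map a sentence to sparse unigram + bigram counts plus a bias feature."""
    tokens = sentence.lower().split()
    feats = Counter(tokens)
    feats.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    feats["<bias>"] += 1
    return feats


def score(weights, feats):
    """Linear decision value w . x for a sparse feature vector."""
    return sum(weights.get(name, 0.0) * count for name, count in feats.items())


def train_linear_svm(data, epochs=50, learning_rate=0.1, reg=0.01):
    """Fit a linear SVM by stochastic subgradient descent on the hinge loss."""
    examples = [(featurize(sentence), label) for sentence, label in data]
    weights = {}
    for _ in range(epochs):
        for feats, label in examples:
            margin = label * score(weights, feats)
            # Subgradient step for the L2 penalty: shrink every weight a little.
            for name in weights:
                weights[name] *= 1.0 - learning_rate * reg
            # Subgradient step for the hinge loss: only when the margin is violated.
            if margin < 1.0:
                for name, count in feats.items():
                    weights[name] = weights.get(name, 0.0) + learning_rate * label * count
    return weights


def predict(weights, sentence):
    """Return +1 (error category present) or -1 (absent)."""
    return 1 if score(weights, featurize(sentence)) >= 0.0 else -1


weights = train_linear_svm(TRAINING_DATA)
training_accuracy = sum(
    predict(weights, sentence) == label for sentence, label in TRAINING_DATA
) / len(TRAINING_DATA)
print(f"training accuracy: {training_accuracy:.2f}")
```

A trained model like this could pre-label new sentences, for example with `predict(weights, "He is doctor")`, so that a human annotator only needs to confirm or correct the suggestion. That is one way to read the abstract's aim of speeding up the creation of further training data. Handling several error categories would need one such binary classifier per category, or a multi-class scheme. The abstract does not say which approach the authors took.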

Similar resources

Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs

Many elements contribute to the relative difficulty in acquiring specific aspects of English as a foreign language (Goldschneider & DeKeyser, 2001). Modal auxiliary verbs (e.g. could, might), are examples of a structure that is difficult for many learners. Not only are they particularly complex semantically, but especially in the Malaysian context ...

Metadiscourse Markers in a Corpus of Learner Language: The Case of Iranian EFL Learners

Different issues have been probed in learner corpus research since the late 1980s. However, taking the importance of metadiscourse markers (MDMs) in signposting academic discourse, their use in Iranian EFL learners' academic essays is an area of research in need of a more serious analysis. Contributing to this line of investigation, this paper reports a corpus-based study of the use of MDMs i...

Error Analysis of Taiwanese University Students’ English Essay Writing: A Longitudinal Corpus Study

Writing is considered one of the most difficult skills in EFL/ESL. Thus, meticulous recognition and classification of students’ errors in certain contexts is a worthwhile endeavor which provides us with both diagnostic and prognostic power. Accordingly, a total of 430 students in 15 English writing classes held during 12 consecutive semesters in a private university in central Taiwan were the s...

Learning with Learner Corpora: using the TLE for Native Language Identification

This study investigates the usefulness of the Treebank of Learner English (TLE) when applied to the task of Native Language Identification (NLI). The TLE is effectively a parallel corpus of Standard/Learner English, as there are two versions; one based on original learner essays, and the other an error-corrected version. We use the corpus to explore how useful a parser trained on ungrammatical ...

Journal:
  • I. J. Knowledge and Web Intelligence

Volume: 5  Issue:

Pages: -

Publication date: 2014